High-dimensional count and compositional data analysis in microbiome studies
Similar resources
Statistical Methods for Compositional and Tree-Structured Count Data in Human Microbiome Studies
In human microbiome studies, sequencing reads data are often summarized as counts of bacterial taxa at various taxonomic levels. In this thesis, we develop statistical methods for analyzing such counts data. We first consider regression analysis with bacterial counts normalized into compositions as covariates. In order to satisfy the subcompositional coherence of the resulting model, linear mod...
Analysis of High Dimensional Compositional Data Containing Structural Zeros with Applications to Microbiome Data
This paper is motivated by the recent interest in the analysis of high dimensional microbiome data. A key feature of this data is the presence of ‘structural zeros’ which are microbes missing from an observation vector due to an underlying biological process and not due to error in measurement. Typical notions of missingness are insufficient to model these structural zeros. We define a general ...
Compositional Mediation Analysis for Microbiome Studies
Motivated by recent advances in causal inference on mediation analysis and problems in the analysis of metagenomic data, we consider the effect of a treatment on an outcome transmitted through microbes, or compositional mediators. Compositional and high dimensional natures of such mediators make the standard mediation analysis not directly applicable. In this paper, we propose a method for esti...
Methods for regression analysis in high-dimensional data
By evolving science, knowledge and technology, new and precise methods for measuring, collecting and recording information have been innovated, which have resulted in the appearance and development of high-dimensional data. The high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
Latent Process Decomposition Of High-Dimensional Count Data
Motivation: Next-generation sequencing (NGS) technologies have become the preferred way of exploring a genome. These data are high-dimensional discrete counts with correlated variables (e.g., genes). We present a novel latent factor model for high-dimensional count data, Latent Process Decomposition (LPD-C), that accounts for the correlations among genes and models the biological hypothesis tha...
Journal
Journal title: SCIENTIA SINICA Mathematica
Year: 2017
ISSN: 1674-7216
DOI: 10.1360/n012017-00147